Patent abstract:
A method of detecting document fraud comprising: obtaining (50) a first image of a first document and a second image of a second document; determining, in the first and second images, regions containing the first and second documents respectively; applying (53) a procedure for detecting areas susceptible to document fraud in the regions of the first image and of the second image registered on the first image; dividing (55) each detected sensitive area into a plurality of sub-parts; calculating a measure of dissimilarity between corresponding sub-parts of the first image and of the second registered image; determining whether the first document is identical to the second document from the dissimilarity measures and, if the first document is different from the second document, determining a level of difference between the first and the second document based on a value representative of a proportion of differing sub-parts; and, detecting fraud when the difference level is below a predetermined threshold.
Publication number: FR3086884A1
Application number: FR1859345
Filing date: 2018-10-09
Publication date: 2020-04-10
Inventors: Thibault BERGER; Laurent ROSTAING; Alain Rouh
Applicant: Idemia Identity and Security France SAS
IPC main class:
Patent description:

The invention relates to a method for detecting attempted document fraud, and to a device implementing said method.
Many fraudulent methods are based on the use of forged documents. This is the case, for example, of intrusion methods based on falsified identity documents. This is also the case with frauds consisting in opening multiple bank accounts at online banks that offer bonuses on account opening. The detection of document fraud is therefore an important issue in terms of security, but also in economic terms.
Some document fraud techniques involve locally modifying an original document to create a new document. For example, these techniques can include replacing a photo in a document or modifying at least one letter or number in a text area of the document. The document obtained is very similar to the original document, but locally contains sufficient differences to defraud.
Techniques exist to counter these frauds. Some are based on the insertion in documents of security patterns or of unforgeable information, such as the biometric information in passports, this biometric information being encoded in an electronic chip of certain passports. However, these techniques require either having physical access to the document, or using a specific image-capture system, typically one able to produce images in the near-infrared spectrum or under ultraviolet fluorescence, which is not possible for online procedures where a user uses, for example, a smartphone in order to transmit only one or more color images of his documents.
A known solution consists in carrying out a classification of a digitized version of a document in order to determine its type (identity card, registration card, driving license, passport). Once the document is classified, high-level information is extracted, such as text fields (also called OCR fields, after Optical Character Recognition, the name of a technique used to automatically extract characters from a document) or information representative of a face contained in the document. Prior classification makes it easier to extract information relevant to fraud detection, since it is then known where to look for this information in the scanned document. In general, these solutions have the disadvantage of being specific to the document studied, and they are sensitive to algorithmic errors when extracting the high-level information.
The document “Document fraud detection at the border: Preliminary observations on human and machine performance”, M. Gariup, G. Soederlind, Proc. European Intelligence and Security Informatics Conference (EISIC), pp. 231-238, Aug. 2013, presents a fairly broad overview of approaches for the detection of document fraud in the case of border control. The automatic methods proposed in this document are essentially based on an analysis of specific security patterns, as well as on the constitution of a database of reference documents, which is often a weak point.
Another approach is to calculate a digital signature of an image of the document and save it. This saved signature can then be used later to authenticate another image of this same document. This approach, which is found in patent document FR3047688, requires enrollment of each document to calculate its reference signature. This approach cannot therefore be used if this enrollment and an infrastructure adapted to manage these signatures are not available. The use of signatures makes it possible to determine whether two documents are identical or different but does not make it possible to determine whether the differences are due to fraud.
It is desirable to overcome these drawbacks of the state of the art.
It is particularly desirable to propose a method making it possible, without prior knowledge of the characteristics of the documents and without prerequisites on the digitization of said documents, to determine whether a first document is the result of one or more local modifications of a second document, which would be representative of an attempted document fraud.
According to a first aspect of the invention, the invention relates to a documentary fraud detection method comprising: obtaining a first image of a first document and a second image of a second document; applying an image registration procedure to the second image in order to register it on the first image, the registration procedure being based on a matching of points of interest identified in the first and second images; applying a procedure for detecting areas susceptible to document fraud in the first image and in the second registered image; dividing each sensitive area detected into a plurality of sub-parts and, for each sub-part, calculating a signature representative of the content of said sub-part; for each sub-part of the first image, searching for a spatially corresponding sub-part in the second registered image, and, for each sub-part of the first image having a corresponding sub-part in the second image, calculating a measure of local dissimilarity between the corresponding sub-parts from the signatures; determining that the first and second documents are identical when a global dissimilarity measure determined from a probability distribution of the local dissimilarity measures is less than a first predetermined threshold and, when the first document is different from the second document, determining a level of difference between the first and second documents as a function of a value representative of a proportion of pixels of the first image located in sensitive areas and belonging to a sub-part having a corresponding sub-part in the second image whose dissimilarity measure is greater than a second predetermined threshold; and, detecting fraud when the difference level is less than a third predetermined threshold.
The method of the invention is completely generic since it requires no prior knowledge of the documents analyzed, of the devices having acquired the images of the documents, of the viewing angle of the documents, or of the illumination of the documents at the time of image capture.
According to one embodiment, the registration procedure comprises: determining that no fraud has been detected when a value representative of an efficiency of matching points of interest between the first and second images is less than a predetermined threshold.
Thus, the registration method makes it possible to discard documents which would be of different types.
According to one embodiment, the procedure for detecting sensitive areas comprises a procedure for detecting a face in the first image and the second registered image and/or a procedure for detecting text areas in the first image and the second registered image.
According to one embodiment, the global dissimilarity measure is such that an integral of a probability distribution of the local dissimilarity measures between the first image and the second registered image, said integral being calculated between the global dissimilarity measure and a maximum value of the local dissimilarity measures in said distribution, is greater than a fourth predetermined threshold, said fourth predetermined threshold being equal to a second predetermined percentage of an integral of the probability distribution of the local dissimilarity measures.
According to one embodiment, to determine a level of difference between the first and second documents, the method comprises: comparing the dissimilarity measure of each sub-part of the first image matched with a sub-part of the second image with the second predetermined threshold, and classifying the pixels of an image, called the intermediate image, taken from the first or second image, belonging to sub-parts for which the dissimilarity measure is greater than the second predetermined threshold, into a class of pixels considered to be dissimilar; forming regions of dissimilar pixels from the pixels classified in the class of pixels considered to be dissimilar; calculating a sum of the areas of the regions of dissimilar pixels and obtaining a dissimilarity indicator representative of said level of difference by dividing this sum by a sum of the areas of the sensitive areas.
According to one embodiment, each sum is a weighted sum, each area being weighted with a predetermined weight depending on an importance assigned to the region of dissimilar pixels or to the sensitive area corresponding to the calculated area.
According to one embodiment, the importance assigned to a region of dissimilar pixels or to a sensitive area is predefined according to a type of content of the area.
According to one embodiment, the regions of dissimilar pixels whose dimension is less than a predetermined dimension are not taken into account in the calculation of the sum of the areas of the regions of dissimilar pixels.
According to one embodiment, the method comprises: applying a segmentation procedure to the first and to the second image, said segmentation procedure making it possible to detect quadrilaterals in an image, each quadrilateral found in an image being considered to be part of the document contained in said image, the image registration procedure and the procedure for detecting sensitive areas being carried out in the quadrilaterals identified by the segmentation procedure.
According to one embodiment, the detection of areas susceptible to document fraud in the first image and in the second registered image is executed in portions of the first image and of the second registered image comprising matched points of interest.
According to one embodiment, each sensitive area is a rectangle and is divided (55) according to a regular tiling, each sub-part being a rectangle whose length is a third predefined percentage of a length of the rectangle representing the detected sensitive area in which the sub-part is located, and whose width is a fourth predefined percentage of a width of said rectangle.
According to a second aspect of the invention, the invention relates to a device for detecting document fraud comprising: means for obtaining a first image of a first document and a second image of a second document; processing means for applying an image registration procedure to the second image in order to register it on the first image, the registration procedure being based on a matching of points of interest identified in the first and second images; processing means for applying a procedure for detecting areas susceptible to document fraud in the first image and in the second registered image; processing means for dividing each sensitive area detected into a plurality of sub-parts and, for each sub-part, for calculating a signature representative of the content of said sub-part; search means for searching, for each sub-part of the first image, a spatially corresponding sub-part in the second registered image, and, for each sub-part of the first image having a corresponding sub-part in the second image, calculation means for calculating a measure of local dissimilarity between the corresponding sub-parts from the signatures; determination means for determining that the first and second documents are identical when a global dissimilarity measure determined from a probability distribution of the local dissimilarity measures is less than a first predetermined threshold; determination means for determining, when the first document is different from the second document, a level of difference between the first and second documents as a function of a value representative of a proportion of pixels of the first image located in sensitive areas and belonging to a sub-part having a corresponding sub-part in the second image whose dissimilarity measure is greater than a second predetermined threshold; and, detection means for detecting fraud when the level of difference is less than a third predetermined threshold.
According to a third aspect of the invention, the invention relates to a computer program, comprising instructions for implementing, by a device, the method according to the first aspect, when said program is executed by a processor of said device.
According to a fourth aspect of the invention, the invention relates to storage means storing a computer program comprising instructions for implementing, by a device, the method according to the first aspect, when said program is executed by a processor of said device.
The characteristics of the invention mentioned above, as well as others, will appear more clearly on reading the following description of an exemplary embodiment, said description being made in relation to the accompanying drawings, among which:
- Fig. 1 schematically illustrates an example of a document;
- Fig. 2A illustrates an image of a first document and FIG. 2B illustrates an image of a second document, the second document resulting from a falsification of the first document;
- Fig. 3A schematically illustrates a device implementing the invention;
- Fig. 3B schematically illustrates an example of hardware architecture of a processing module used to implement the invention;
- Fig. 4 schematically illustrates a method for determining that a document results from a falsification of another document;
- Fig. 5 schematically illustrates an example of an image registration procedure;
- Fig. 6 illustrates in detail a procedure for detecting areas susceptible to documentary fraud;
- Fig. 7 schematically describes an exemplary method for determining whether two images correspond to the same document, to identical documents, or to the same document but with fraudulent alteration;
- Fig. 8 illustrates a matching of points of interest identified in a first and a second image; and,
- Figs. 9 and 10 schematically illustrate a result of the division into sub-parts of sensitive areas detected respectively in a first image and in a second image registered on the first image.
The invention is described below in a context where the falsified document is an identity card. The invention is however suitable for other types of documents such as driving licenses, registration cards, passports, etc. Furthermore, the invention is implemented here by a device such as a computer obtaining documents via a communication network. The invention can however be implemented by other devices, in particular devices having image acquisition capacities such as a smartphone or a tablet. In addition, the invention applies to images that may show the documents viewed from the front or in perspective. A perspective view of a document shows a distorted document: an initially rectangular document can then become an arbitrary quadrilateral. The invention can also be applied to documents having undergone more complex deformations, such as folded, crumpled or non-planar documents, etc.
Fig. 1 schematically illustrates an example of a document.
The document of FIG. 1 is a schematic identity card comprising several text fields such as a field 10 comprising a number, a field 11 comprising a name, a field 12 comprising a first name, a field 13 comprising a date of birth, and fields 14 and 15 containing information contained in fields 10 to 13 and in particular the name, first name and number. All these fields are customizable since they depend on the owner of the identity card.
Fig. 2A illustrates an image of a first document and FIG. 2B illustrates an image of a second document, the second document resulting from a falsification of the first document.
The documents of Figs. 2A and 2B are identity cards. Since Figs. 2A and 2B represent real documents, black masks have been placed, for confidentiality reasons, on the document fields which could make it possible to recognize the owner of the documents. However, all the processing applied to images described below, in particular in relation to FIG. 4, is applied to each image in full, without taking the black masks into account. Field 11 appears in both the first and second documents. As can be seen, in the case of the second document, the document fraud consisted in modifying field 11. Thus, in field 11, the letter "N" has been replaced by the letter "S" and a letter "A" has been added at the end of field 11.
Fig. 3A schematically illustrates a device implementing the invention.
The device 30 of FIG. 3A includes a processing module 300 and a display module 301 such as a screen. The display module 301 displays messages intended for users indicating whether an attempted document fraud has been detected. In one embodiment, the display module 301 displays the analyzed images but no fraud-detection information, so as not to give a user clues about the fraud detection method used. However, the fraud detection information is transmitted by the device 30 to a central system, not shown, so that action can be taken.
Fig. 3B schematically illustrates an example of hardware architecture of the processing module 300 used to implement the invention.
According to the example of hardware architecture shown in FIG. 3B, the processing module 300 comprises, connected by a communication bus 3000: a processor or CPU ("Central Processing Unit") 3001; a random access memory (RAM) 3002; a read-only memory (ROM) 3003; a storage unit such as a hard disk or a storage medium reader, such as an SD ("Secure Digital") card reader 3004; and at least one communication interface 3005. The communication interface 3005 allows the processing module 300 to receive, for example from a communication network, images for which it must determine whether there has been an attempted fraud. Each image can for example be transmitted to the processing module 300 by an image acquisition device such as a still camera, a video camera or a smartphone.
Processor 3001 is capable of executing instructions loaded into RAM 3002 from ROM 3003, an external memory (not shown), a storage medium (such as an SD card), or a communication network. When the processing module 300 is powered up, the processor 3001 is able to read instructions from RAM 3002 and execute them. These instructions form a computer program causing the implementation, by the processor 3001, of all or part of the methods described below in relation to FIG. 4.
The method described below in relation to FIG. 4 can be implemented in software form by execution of a set of instructions by a programmable machine, for example a DSP ("Digital Signal Processor"), a microcontroller or a GPU ("Graphics Processing Unit"), or be implemented in hardware form by a dedicated machine or component, for example an FPGA ("Field-Programmable Gate Array") or an ASIC ("Application-Specific Integrated Circuit").
Fig. 4 schematically illustrates a method for determining that a document results from a falsification of another document.
In a step 50, the processing module 300 obtains a first image of a first document and a second image of a second document. The first and second documents can be documents of the same type (for example two identity cards) or of different types (for example an identity card and a passport). The first and second images could be acquired with different or identical image acquisition devices, from different or identical points of view and with different or identical illumination conditions. For example, the first document is the document of FIG. 2A and the second document is the document of FIG. 2B.
It is subsequently assumed that each document covered by this invention is initially of rectangular shape, which is the case with official documents. Furthermore, although the invention is suitable for documents having undergone more complex deformations, such as folded, crumpled or non-planar documents, to simplify the description, the following description is essentially concerned with deformations due to differences in viewpoint. An image of a rectangle taken from an arbitrary point of view is a quadrilateral.
In a step 52, the processing module 300 applies an image registration procedure to the second image in order to register it on the first image, the registration procedure being based on a matching of points of interest identified in the first and second images.
Fig. 5 schematically illustrates an example of an image registration procedure.
In a step 520, the processing module 300 applies a method of searching for points of interest to the first and second images. The method of finding points of interest is, for example, the method described in the article "SURF: Speeded Up Robust Features", Bay, Tuytelaars, Van Gool, 2006, or the method described in the article "SUSurE: Speeded Up Surround Extrema Feature Detector and Descriptor for real-time applications", Ebrahimi, Cuevas, 2008. The application of the point-of-interest search method makes it possible to obtain a list of points of interest for the first and second images, each point of interest of an image being identified by two-dimensional coordinates in the image and by a local descriptor of the content of the image.
In a step 521, the processing module 300 applies a method of matching the points of interest of the first and second images. The point here is to map points of interest of the first image to points of interest of the second image. Thus, for each point of interest of the first image, the processing module 300 searches for a corresponding point of interest in the second image. The mapping of points of interest is done, for example, by a correlation measure between the immediate neighborhoods of the points of interest for which the correspondence is to be determined. The neighborhood of a point of interest is generally defined as a window of pixels of predefined size around the point of interest. For example, in one embodiment, the matching method uses a square window of five pixels per side. Various correlation measures exist. In one embodiment, the matching method uses a correlation measure of the NCC type ("Normalized Cross Correlation") used in the article "Wide-baseline Multiple-View Correspondences", V. Ferrari et al., IEEE Conference Proceedings of Computer Vision and Pattern Recognition (CVPR), Vol. 2, pp. 718-725, Madison, United States, June 2003, or of the ZNCC type ("Zero Mean Normalized Cross Correlation") used for example in the article "A Fast Matching Method", C. Sun, Proc. of Digital Image Computing: Techniques and Applications (DICTA), pp. 95-100, Auckland, New Zealand, December 1997. When a point of interest of the first image has several correspondents in the second image, the processing module 300 chooses the point of interest of the second image offering the highest correlation measure with the point of interest of the first image.
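By way of illustration, the following Python sketch shows such a neighborhood-correlation matching; it is not the patented implementation. ORB is used as a freely available stand-in for SURF, and the window size (five pixels per side, as in the text) and the 0.8 acceptance threshold are assumptions:

import cv2
import numpy as np

WIN = 5  # square neighborhood of five pixels per side

def patch(img, pt, win=WIN):
    """Extract the win x win neighborhood around a point, or None at borders."""
    x, y = int(round(pt[0])), int(round(pt[1]))
    h = win // 2
    if x - h < 0 or y - h < 0 or x + h >= img.shape[1] or y + h >= img.shape[0]:
        return None
    return img[y - h:y + h + 1, x - h:x + h + 1].astype(np.float64)

def zncc(a, b):
    """Zero Mean Normalized Cross Correlation between two equal-size patches."""
    a, b = a - a.mean(), b - b.mean()
    d = np.sqrt((a * a).sum() * (b * b).sum())
    return (a * b).sum() / d if d > 0 else 0.0

def match_points(img1, img2, min_corr=0.8):
    """Match points of interest of img1 (grayscale) to points of img2 by ZNCC.
    When several candidates exist, the highest correlation wins (step 521)."""
    orb = cv2.ORB_create(nfeatures=2000)  # assumed stand-in detector for SURF
    kp1, kp2 = orb.detect(img1, None), orb.detect(img2, None)
    matches = []
    for k1 in kp1:
        p1 = patch(img1, k1.pt)
        if p1 is None:
            continue
        best, best_c = None, min_corr
        for k2 in kp2:
            p2 = patch(img2, k2.pt)
            if p2 is None:
                continue
            c = zncc(p1, p2)
            if c > best_c:
                best, best_c = k2.pt, c
        if best is not None:
            matches.append((k1.pt, best))
    return matches, len(kp1)

The ratio len(matches) / len(kp1) then plays the role of the percentage p compared with P in step 522 below.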
Other, more sophisticated point-of-interest matching methods can be used. One can cite, for this purpose, methods based on calculations of local descriptors and matching by stochastic algorithms such as RANSAC ("RANdom SAmple Consensus"), introduced in the document "Random Sample Consensus: A Paradigm for Model Fitting with Applications to Image Analysis and Automated Cartography", Martin A. Fischler and Robert C. Bolles, Comm. of the ACM, vol. 24, June 1981, pp. 381-395. The method described in the document "Object recognition from local scale-invariant features", Lowe D. G., Proceedings of the International Conference on Computer Vision, vol. 2, pp. 1150-1157, 1999, can also be advantageously applied.
Fig. 8 illustrates a mapping of points of interest identified in a first and a second image.
The image on the left in Fig. 8 corresponds to an image of the first document of FIG. 2A taken from the front. The image on the right in Fig. 8 corresponds to an image of the second document of FIG. 2B taken in perspective. The second document appears distorted: it is no longer a rectangle but a quadrilateral. In Fig. 8, straight lines connect the points of interest detected in the first document to the points of interest detected in the second document. It can be seen that a large number of points of interest in the first document have found a correspondent in the second document, which is logical since the two documents are of the same type and, moreover, the second document results from a falsification of the first document.
In a step 522, the processing module 300 checks whether the execution of the matching method makes it possible to consider that the two documents are at least similar. To do this, the processing module 300 determines, for example, whether a value representative of an efficiency of the matching of points of interest between the first and second images is less than a predetermined threshold: for example, whether a percentage p of points of interest of the first image for which a corresponding point of interest has been found in the second image is less than a predetermined percentage P. In one embodiment, P = 80%.
If p < P, the processing module 300 considers that the first document does not correspond to the second document. In this case, the processing module terminates the process of FIG. 4 and displays on the display module 301 a message indicating that no fraud has been detected.
The image registration procedure therefore makes it possible to discard documents which are of different types.
If p > P, during a step 524, the processing module calculates a transformation making it possible to register the second image on the first image, i.e. making it possible to register the points of interest of the second image on the corresponding points of interest of the first image. To do this, the processing module 300 calculates a geometric deformation model of the second document with respect to the first document. Such a geometric deformation model can be:
• an affine model determining an affine application connecting points of interest of the second document to points of interest of the first document such as a translation, rotation or homothety;
• a homographic model determining a homographic application connecting the points of interest of the second document to the points of interest of the first document;
• an interpolation model determined using an inverse distance weighting algorithm, as proposed in the document "Scattered data interpolation: tests of some methods", R. Franke, Mathematics of Computation, 38(157), pp. 182-200, 1982, and/or spline interpolation.
In one embodiment, the processing module calculates a combination of different models with local estimation of deformation models. The use of such a combination of models makes it possible to model more complex deformations than with a single model.
In one embodiment, following the calculation of the geometric deformation model(s), each calculated model is applied to the second image. A second registered image is then obtained, in which the second document is registered on the first document.
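A minimal sketch of the homographic variant of step 524, assuming the matched point pairs produced above (at least four are needed); it relies on OpenCV's findHomography (with RANSAC to reject outlier matches) and warpPerspective to produce the second registered image. The affine model could be estimated analogously with cv2.estimateAffinePartial2D:

import cv2
import numpy as np

def register_on_first(img1, img2, matches):
    """Register img2 on img1 from matched pairs [((x1, y1), (x2, y2)), ...]."""
    pts1 = np.float32([m[0] for m in matches]).reshape(-1, 1, 2)
    pts2 = np.float32([m[1] for m in matches]).reshape(-1, 1, 2)
    # Homographic deformation model; RANSAC discards aberrant matches.
    H, _ = cv2.findHomography(pts2, pts1, cv2.RANSAC, ransacReprojThreshold=5.0)
    h, w = img1.shape[:2]
    return cv2.warpPerspective(img2, H, (w, h))  # second registered image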
In a step 53, the processing module 300 applies a procedure for detecting areas susceptible to document fraud, called sensitive areas, in the first image and in the second registered image. In the context of the invention, sensitive areas have a very generic definition, since they are areas of a real document that a fraudster would have an interest in modifying. The type of document analyzed by the method of the invention is never taken into account in the definition of the sensitive areas. In this context, it can be considered that, in a document, there are two types of areas susceptible to documentary fraud:
• the areas comprising a face image (i.e. typically an image representing an identity photo);
• areas with text.
Fig. 6 illustrates in detail a procedure for detecting areas susceptible to document fraud.
The procedure described in relation to FIG. 6 corresponds to step 53.
In a step 530, the processing module performs a face detection procedure in the first image and the second registered image. The processing module 300 applies, for this, for example the method described in the document "Eigenfaces for recognition", M. Turk and A. Pentland, Journal of Cognitive Neuroscience, Vol. 3, No. 1, pp. 71-86, 1991, or the method described in the document "Face detection system on AdaBoost algorithm using Haar classifiers", M. Gopi Krishna, A. Srinivasulu, International Journal of Modern Engineering Research (IJMER), Vol. 2, Issue 5, Sep.-Oct. 2012, pp. 3556-3560. If the document included in an image includes a face photo, step 530 makes it possible to obtain the coordinates of at least one quadrilateral corresponding to a sensitive area in which a face is located.
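As an illustration of step 530, the sketch below uses OpenCV's bundled Haar cascade, an AdaBoost/Haar detector in the spirit of the second cited method; the cascade file and detection parameters are assumptions, not the patented configuration:

import cv2

face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def detect_face_areas(img):
    """Return rectangles (x, y, w, h) of sensitive areas containing a face."""
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    return [tuple(int(v) for v in r)
            for r in face_cascade.detectMultiScale(gray, scaleFactor=1.1,
                                                   minNeighbors=5)]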
In a step 531, the processing module 300 applies a procedure for detecting text zones in the first image and the second registered image. The processing module 300 applies, for this, for example one of the methods described in the document "Review on text string detection from natural scenes", International Journal of Engineering and Innovative Technology (IJEIT), Vol. 2, Issue 4, October 2012, or the method described in the document "Extraction and recognition of artificial text in multimedia documents", C. Wolf, J.M. Jolion, Pattern Analysis & Applications, Feb. 2004, Vol. 6, Issue 4, pp. 309-326. Step 531 makes it possible to obtain the coordinates of at least one quadrilateral corresponding to a sensitive area in which a text is located.
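For step 531, the following sketch shows a deliberately simple text-zone detector (a morphological heuristic, not one of the cited methods): text produces dense gradients which, once closed horizontally, merge into word- or line-level blobs. All kernel sizes and size thresholds are assumptions:

import cv2

def detect_text_areas(img, min_w=30, min_h=10):
    """Return rectangles (x, y, w, h) of candidate text zones."""
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    grad = cv2.morphologyEx(gray, cv2.MORPH_GRADIENT,
                            cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3)))
    _, bw = cv2.threshold(grad, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)
    closed = cv2.morphologyEx(bw, cv2.MORPH_CLOSE,
                              cv2.getStructuringElement(cv2.MORPH_RECT, (15, 1)))
    contours, _ = cv2.findContours(closed, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    boxes = [cv2.boundingRect(c) for c in contours]
    return [(x, y, w, h) for (x, y, w, h) in boxes if w >= min_w and h >= min_h]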
Returning to FIG. 4, in a step 55, the processing module 300 divides each detected sensitive area of each image into a plurality of sub-parts and, for each sub-part, calculates a signature representative of the content of said sub-part. In one embodiment, each quadrilateral corresponding to a detected sensitive area is a rectangle, and each rectangle is divided according to a regular tiling. For example, each sub-part is a rectangle whose length is a predefined percentage pl of the length of the rectangle representing the detected sensitive area in which the sub-part is located, and whose width is a predefined percentage pL of the width of said rectangle. In one embodiment, the predefined percentages pl and pL are between "10%" and "33%".
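A sketch of the regular tiling of step 55, with pl = pL = 25% (an assumed value within the "10%"-"33%" range given above); border tiles are clipped to the sensitive area:

def tile_area(x, y, w, h, pl=0.25, pL=0.25):
    """Divide a rectangular sensitive area into a regular tiling of sub-parts.
    Returns the sub-parts as rectangles (x, y, w, h)."""
    sw, sh = max(1, int(w * pl)), max(1, int(h * pL))
    tiles = []
    for j in range(y, y + h, sh):
        for i in range(x, x + w, sw):
            tiles.append((i, j, min(sw, x + w - i), min(sh, y + h - j)))
    return tiles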
In one embodiment, the signature calculated for each sub-part is a histogram of oriented gradients ("Histogram of Oriented Gradients", HOG). This type of signature is particularly suitable for textured elements such as faces, as shown in the document "Histograms of Oriented Gradients for human detection", N. Dalal and B. Triggs, 2005, IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05). Other signature types can however be used, such as the LBP (Local Binary Patterns) descriptors described in the document "Local Binary Patterns and Its Application to Facial Image Analysis: A Survey", D. Huang et al., IEEE Transactions on Systems, Man and Cybernetics, Part C (Applications and Reviews), 41(6):765-781, November 2011. Regarding the text zones, a histogram of text stroke orientations or a Fourier descriptor can also advantageously be used.
In a step 56, the processing module 300 searches, for each sub-part of the first image, for a spatially corresponding sub-part in the second registered image. For each sub-part of the first image having a corresponding sub-part in the second registered image, the processing module calculates a dissimilarity measure, called local dissimilarity measure, between the corresponding sub-parts from the signatures of the corresponding sub-parts. When, for example, the signature is a gradient orientation histogram, the local dissimilarity measure of a sub-part of the first image with a sub-part of the second registered image is a difference between the gradient orientation histograms of each sub-part. A gradient orientation histogram comprises, for each orientation of a plurality of predefined gradient orientations, a value, called a statistical value, representative of a number of pixels for which the gradient calculated at the position of these pixels follows said orientation. A difference between two gradient orientation histograms is obtained, for example, by calculating, for each orientation appearing in the two histograms, a difference between the statistical values corresponding to this orientation, and then calculating the sum of the absolute values of the differences calculated for each orientation.
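The sketch below illustrates the signature and the local dissimilarity measure just described, using scikit-image's HOG implementation. The HOG parameterization (9 orientations, 8x8-pixel cells) is an assumption, and sub-parts are assumed large enough to contain at least one cell:

import numpy as np
from skimage.feature import hog

def signature(subpart_gray):
    """HOG signature (gradient orientation histogram) of one sub-part."""
    return hog(subpart_gray, orientations=9, pixels_per_cell=(8, 8),
               cells_per_block=(1, 1), feature_vector=True)

def local_dissimilarity(sig1, sig2):
    """Sum of absolute differences between the statistical values of the two
    gradient orientation histograms, as described above."""
    return float(np.abs(sig1 - sig2).sum())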
Figs. 9 and 10 schematically illustrate a result of the division into sub-parts of the sensitive areas detected respectively in the first image and in the second registered image.
The sensitive areas are represented in the form of rectangles in hatched lines. As can be seen in Figs. 9 and 10, substantially the same sensitive areas are found in the first image and in the second registered image. Each sensitive area is divided into square-shaped sub-parts. Depending on the value of the local dissimilarity measure calculated for each sub-part following the matching of the sub-parts of the first image with the sub-parts of the second registered image, the periphery of each sub-part appears more or less dark. A sub-part that is very dissimilar from the sub-part with which it has been matched appears darker than a sub-part that is very close to the sub-part with which it has been matched. It can be noted that a large majority of sub-parts have a light periphery. However, the sensitive areas 900 and 1000 comprise very dark sub-parts. These very dark sub-parts correspond to the letter "N" of the first document, which has been replaced by the letter "S" in the second document, and to the letter "A" which has been added.
In one embodiment, during step 56, the processing module 300 applies a transformation in the form of a Gaussian pyramid to the first image and to the second registered image. For each image, each detected sensitive area appears in each level of the Gaussian pyramid resulting from the transformation. The division of each detected sensitive area into a plurality of sub-parts and, for each sub-part, the calculation of a signature representative of the content of said sub-part, are then done in each level of the Gaussian pyramid. A multiscale analysis of the first image and of the second registered image is thus obtained.
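A minimal sketch of this multiscale variant; the level count is an assumption, and sensitive-area coordinates would be halved at each level before re-tiling and re-signing:

import cv2

def gaussian_pyramid(img, levels=3):
    """Gaussian pyramid of an image; each level halves the resolution."""
    pyr = [img]
    for _ in range(levels - 1):
        pyr.append(cv2.pyrDown(pyr[-1]))
    return pyr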
In a step 57, the processing module 300 determines whether the first and the second documents are representative of fraud based on the calculated local dissimilarity measures.
Fig. 7 schematically illustrates an example of a method for determining whether two images correspond to the same document, to identical documents, or to the same document but with fraudulent alteration corresponding to step 57.
In a step 570, the processing module 300 determines a dissimilarity measure D1 such that an integral of a probability distribution of the local dissimilarity measures between the first image and the second registered image, said integral being calculated between the dissimilarity measure D1 and the maximum value of the local dissimilarity measures in said distribution, is greater than a predetermined threshold S1. For example, S1 equals "5%" of the integral of the probability distribution of the local dissimilarity measures. The dissimilarity measure D1 can be considered as a global dissimilarity measure representative of a dissimilarity between the document of the first image and the document of the second image.
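Concretely, with S1 = 5%, D1 is the value below which 95% of the probability mass of the local dissimilarity measures lies, i.e. their 95th percentile. A sketch:

import numpy as np

def global_dissimilarity(local_ds, tail=0.05):
    """D1 such that the integral of the distribution of the local dissimilarity
    measures between D1 and their maximum equals S1 = tail (here 5%) of the
    total integral: the (1 - tail) percentile of the measures."""
    return float(np.percentile(np.asarray(local_ds), 100.0 * (1.0 - tail)))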
In a step 571, the processing module 300 compares the dissimilarity measure D1 with a predetermined threshold S2.
The first document is considered to be identical to the second document when the dissimilarity measure D1 is less than the predetermined threshold S2.
The predetermined threshold S2 is determined by learning using an image base comprising pairs of images representing identical documents and pairs of images in which one of the documents of the pair results from a falsification of the other document of the pair.
In one embodiment, other global dissimilarity measures can be used. For example, during step 570, the dissimilarity measure D1 can be replaced by a dissimilarity measure D2. The dissimilarity measure D2 is calculated as the integral of the probability distribution of the local dissimilarity measures between the first image and the second registered image, taken between a value α·dmax and dmax, where dmax is the maximum value of the local dissimilarity measures and α is a weight lying in the open interval ]0,1[. For example, α = 0.5. In this case, the predetermined threshold S2 is adapted to the dissimilarity measure D2.
If the first document is identical to the second document, the processing module 300 displays, during a step 572, a message on the display module indicating that the two compared documents are identical. In this case, the processing module 300 considers that there has been no falsification and the method of FIG. 4 ends.
If the processing module 300 determines that the two documents are different, it must determine how different the two documents are. To do this, during steps 573 to 578, the processing module 300 determines a level of difference between the first and the second documents as a function of a value representative of a proportion of pixels of the first image located in sensitive areas and belonging to a sub-part, having a corresponding sub-part in the second image, whose dissimilarity measure is greater than a predetermined threshold S3.
In one embodiment, the processing module 300 compares the dissimilarity measure of each sub-part of the first image matched with a sub-part of the second image with the predetermined threshold S3. The threshold S3 is determined by learning, using an image base comprising pairs of images representing identical documents and pairs of images in which one of the documents of the pair results from a falsification of the other document of the pair. For each pair of images in the base, a distribution of the dissimilarity measures is calculated. The threshold S3 is then determined so as to ensure good separability between the distributions of the pairs of identical images and those of the pairs of images comprising a falsified image. A classification of the ROC curve type ("Receiver Operating Characteristic") can then be used to ensure this good separability.
During a step 573, the pixels of the first image (or of the second image) of the sub-parts having a dissimilarity measure below the threshold S3 are considered to be identical and set to the value "0". The pixels of the first image (or of the second image) of the subparts having a dissimilarity measure greater than or equal to the threshold S3 are considered to be different and set to the value "1". All other pixels of the first (or second image) are set to a value different from "0" and "1".
In a step 574, a connected component analysis is applied to the first image thus modified in order to aggregate into a same region the pixels of said image equal to "1". The connected component analysis makes it possible to obtain, in the first image, a set of regions, called dissimilar regions, each comprising pixels having the value "1".
In steps 575 and 577, the processing module 300 calculates a dissimilarity indicator ID representative of the level of difference between the first and the second documents. To do this, during step 575, the processing module 300 calculates the area of each dissimilar region (the area of a region can be represented by a number of pixels in the region).
The processing module 300 then calculates, during step 577, a sum σRD of the areas of the dissimilar regions and divides this sum by a sum σZS of the areas of the detected sensitive areas in order to obtain the dissimilarity indicator ID:
ID = σRD / σZS
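A sketch of steps 573 to 577, assuming boolean masks of the dissimilar pixels and of the sensitive areas; connected components are labelled with SciPy, and the optional removal of small regions (step 576, described below) is shown as an assumed area filter:

import numpy as np
from scipy import ndimage

def dissimilarity_indicator(dissimilar_mask, sensitive_mask, min_area=100):
    """ID = sigma_RD / sigma_ZS: summed area of the dissimilar regions divided
    by the summed area of the detected sensitive areas (areas in pixels)."""
    labels, n = ndimage.label(dissimilar_mask)  # connected component analysis
    areas = np.asarray(ndimage.sum(dissimilar_mask, labels, range(1, n + 1)))
    sigma_rd = float(areas[areas >= min_area].sum())  # drop tiny regions
    sigma_zs = float(np.count_nonzero(sensitive_mask))
    return sigma_rd / sigma_zs if sigma_zs else 0.0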
In a step 578, the processing module 300 compares the dissimilarity indicator ID with a predetermined threshold S4 in order to determine whether there has been an attempted document fraud. The threshold S4 is determined by learning, using an image base comprising pairs of images representing identical documents and pairs of images in which one of the documents of the pair results from a falsification of the other document of the pair. A dissimilarity indicator ID is calculated for each pair of the image base. A histogram of the ID dissimilarity indicators is then calculated for each type of pair, each histogram being representative of the probability distribution of the ID dissimilarity indicators. In each histogram, each dissimilarity indicator is associated with a probability that this dissimilarity indicator will occur. For each value of the dissimilarity indicator ID, a probability of false alarm for the pairs of images representing identical documents, and a probability of non-detection of fraud for the pairs of images in which one of the documents of the pair results from a falsification of the other document of the pair, are calculated. Using these probabilities, a classification of the ROC curve type is used to find a threshold value S4 making it possible to obtain good separability between the dissimilarity indicators representative of fraud and the dissimilarity indicators representative of different documents not resulting from fraud.
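One reasonable reading of this ROC-based learning, sketched with scikit-learn; the use of Youden's J statistic to pick the operating point is an assumption. Since fraud corresponds to low ID values, the indicator is negated before the ROC analysis:

import numpy as np
from sklearn.metrics import roc_curve

def learn_threshold_s4(ids_fraud, ids_different):
    """Pick S4 from labelled dissimilarity indicators: ids_fraud come from
    pairs with a falsified document, ids_different from genuinely different
    documents. Returns the threshold maximising separability (Youden's J)."""
    y = np.r_[np.ones(len(ids_fraud)), np.zeros(len(ids_different))]
    scores = -np.r_[ids_fraud, ids_different]  # low ID = fraud, hence negation
    fpr, tpr, thr = roc_curve(y, scores)
    best = int(np.argmax(tpr - fpr))
    return -float(thr[best])  # undo the negation to obtain S4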
If the dissimilarity indicator ID calculated during step 577 is less than S4, the processing module 300 decides, during a step 579, that it is in the presence of an attempted fraud and displays a message on the display module 301 indicating this attempted fraud. Otherwise, the processing module 300 decides, during a step 580, that there has been no fraud attempt and displays a message indicating this on the display module 301.
In one embodiment, during a step 576, intermediate between steps 575 and 577, the processing module 300 eliminates each dissimilar region having a dimension less than a predetermined dimension A, i.e. the pixels of these regions are set to "0". For example, the predetermined dimension A is a height, and the processing module 300 eliminates the dissimilar regions located in a sensitive area comprising text having a height less than "A = 10" pixels or less than "A = 5" pixels. Indeed, below such a height, it becomes difficult to analyze a text reliably. In another example, the predetermined dimension A is an area, and the processing module 300 eliminates dissimilar regions having an area less than "A = 100" square pixels. Eliminating dissimilar regions having a small dimension makes it possible to reduce the impact of an insignificant dissimilarity due, for example, to a residual registration error or to noise in the images.
In one embodiment, the area of each dissimilar region is weighted according to the importance of said dissimilar region. The sum of the areas of the dissimilar regions is therefore a weighted sum. Similarly, in this embodiment, the sum of the areas of the detected sensitive areas is a weighted sum according to the importance of each sensitive area. For example, when a dissimilar region is located in a sensitive area comprising a face, the area of this region receives a first predetermined weight greater than a second predetermined weight assigned to the area of a dissimilar region included in a sensitive area comprising text. When the text area detection method executed during step 531 makes it possible to differentiate the customizable text areas from the non-customizable text areas in a document, the area of a dissimilar region belonging to a customizable text area receives a third predetermined weight greater than a fourth predetermined weight assigned to the area of a dissimilar region belonging to a non-customizable text area.
In one embodiment, in a step 51, intermediate between step 50 and step 52, the processing module 300 applies a segmentation method to the first and second images. For example, the segmentation method makes it possible to detect quadrilaterals in an image, which is consistent with rectangular documents seen in perspective in an image. Following the application of the segmentation method, each quadrilateral found in an image is considered to be part of the document included in said image. For example, the processing module 300 applies, for this, the segmentation method described in the document "Rectangle detection based on a windowed Hough transform", C.R. Jung and R. Schramm, Brazilian Symposium on Computer Graphics and Image Processing (SIBGRAPI), Curitiba, 2004, or the segmentation method described in the document "Detection of quadrilateral document regions from digital photographs", Jian Fan, WACV 2016. These two methods make it possible to locate the first and second documents in the first and second images respectively. Thus, each processing carried out in the rest of the method (corresponding to steps 52, 53, 55 and 56) is applied only to the part of the first and second images containing the first and second documents respectively, and not to the whole of the first and second images. In this way, the complexity of the method is reduced.
Other segmentation methods more suited to documents having undergone more complex transformations can however be used.
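By way of illustration of such a segmentation, the following sketch uses a simple contour-based approach (Canny edges plus polygonal approximation, not the cited windowed Hough transform); the blur size, Canny thresholds and 2% approximation tolerance are assumptions:

import cv2
import numpy as np

def find_document_quad(img):
    """Return the largest 4-vertex contour as the document quadrilateral,
    or None if no quadrilateral is found."""
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(cv2.GaussianBlur(gray, (5, 5), 0), 50, 150)
    contours, _ = cv2.findContours(edges, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    best, best_area = None, 0.0
    for c in contours:
        approx = cv2.approxPolyDP(c, 0.02 * cv2.arcLength(c, True), True)
        area = cv2.contourArea(approx)
        if len(approx) == 4 and area > best_area:
            best, best_area = approx.reshape(4, 2), area
    return best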
In one embodiment, the step 53 of detecting areas susceptible to document fraud in the first image and in the second registered image is executed only in regions of the first image and of the second registered image comprising points of interest detected during step 52. For example, the processing module 300 determines in the first image (respectively in the second registered image) a convex hull encompassing all the points of interest detected in this image. The sensitive areas are then sought in each determined convex hull. In this way, the complexity of the method is reduced.
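A sketch of this restriction, assuming the matched points of interest from step 52: the convex hull of the points is rasterized into a mask, and the sensitive-area detectors are then run only inside it:

import cv2
import numpy as np

def keypoint_hull_mask(shape, points):
    """Binary mask of the convex hull enclosing the given (x, y) points."""
    pts = np.asarray(points, dtype=np.int32).reshape(-1, 1, 2)
    hull = cv2.convexHull(pts)
    mask = np.zeros(shape[:2], dtype=np.uint8)
    cv2.fillConvexPoly(mask, hull, 255)
    return mask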
Claims (14)
1) Method for detecting documentary fraud, characterized in that it comprises:
obtaining (50) a first image of a first document and a second image of a second document;
applying (52) an image registration procedure to the second image in order to register it on the first image, the registration procedure being based on a matching of points of interest identified in the first and second images;
applying (53) a procedure for detecting areas susceptible to documentary fraud in the first image and in the second registered image;
dividing (55) each sensitive area detected into a plurality of sub-parts and, for each sub-part, calculating a signature representative of the content of said sub-part;
for each sub-part of the first image, searching for a spatially corresponding sub-part in the second registered image, and, for each sub-part of the first image having a corresponding sub-part in the second image, calculating (56) a local dissimilarity measure between the corresponding sub-parts from the signatures;
determining that the first and second documents are identical (572) when a global dissimilarity measure determined (570) from a probability distribution of the local dissimilarity measures is less than a first predetermined threshold and, when the first document is different from the second document,
determining (577) a level of difference between the first and the second document as a function of a value representative of a proportion of pixels of the first image located in sensitive areas belonging to a sub-part having a corresponding sub-part in the second image whose dissimilarity measure is greater than a second predetermined threshold; and, detecting fraud (579) when the level of difference is less than a third predetermined threshold.
2) Method according to claim 1, characterized in that the registration procedure comprises:
determining that no fraud has been detected when a value representative of an efficiency of matching points of interest between the first and second images is less than a predetermined threshold.
3) Method according to claim 1 or 2, characterized in that the procedure for detecting sensitive areas comprises a procedure for detecting a face in the first image and the second registered image and/or a procedure for detecting text areas in the first image and the second registered image.
4) Method according to claim 1, 2 or 3, characterized in that the global dissimilarity measure is such that an integral of a probability distribution of the local dissimilarity measures between the first image and the second registered image, said integral being calculated between the global dissimilarity measure and a maximum value of the local dissimilarity measures in said distribution, is greater than a fourth predetermined threshold, said fourth predetermined threshold being equal to a second predetermined percentage of an integral of the probability distribution of the local dissimilarity measures.
5) Method according to any one of the preceding claims, characterized in that, to determine a level of difference between the first and the second documents, the method comprises:
comparing (573) the dissimilarity measure of each sub-part of the first image matched with a sub-part of the second image with the second predetermined threshold, and classifying the pixels of an image, called the intermediate image, taken from among the first or second images, belonging to sub-parts for which the dissimilarity measure is greater than the second predetermined threshold, into a class of pixels considered to be dissimilar;
forming (574) regions of dissimilar pixels from the pixels classified in the class of pixels considered to be dissimilar;
calculating (575, 577) a sum of the areas of regions of dissimilar pixels and obtaining a dissimilarity indicator representative of said level of difference by dividing this sum by a sum of the areas of sensitive areas.
6) Method according to claim 5, characterized in that each sum is a weighted sum, each area being weighted with a predetermined weight depending on an importance assigned to the region of dissimilar pixels or to the sensitive area corresponding to the calculated area.
7) Method according to claim 6, characterized in that the importance assigned to a region of dissimilar pixels or to a sensitive area is predefined according to a type of content of the area.
8) Method according to claim 5, 6 or 7, characterized in that the regions of dissimilar pixels whose dimension is less than a predetermined dimension are not taken into account in the calculation of the sum of the areas of the regions of dissimilar pixels.
9) Method according to any one of the preceding claims, characterized in that the method comprises: applying (51) a segmentation procedure to the first and to the second images, said segmentation procedure making it possible to detect quadrilaterals in an image, each quadrilateral found in an image being considered to be part of the document contained in said image, the image registration procedure and the procedure for detecting sensitive areas being carried out in the quadrilaterals identified by the segmentation procedure.
10) Method according to any one of the preceding claims, characterized in that the detection of areas susceptible to documentary fraud in the first image and in the second registered image is executed in portions of the first image and of the second registered image comprising matched points of interest.
11) Method according to any one of the preceding claims, characterized in that each sensitive area is a rectangle and is divided (55) according to a regular tiling, each sub-part being a rectangle whose length is a third predefined percentage of a length of the rectangle representing the sensitive area detected in which the sub-part is located and whose width is a fourth predefined percentage of a width of said rectangle.
12) Device for detecting documentary fraud, characterized in that it includes:
obtaining means (50) for obtaining a first image of a first document and a second image of a second document;
processing means (52) for applying an image registration procedure to the second image in order to register it on the first image, the registration procedure being based on a matching of points of interest identified in the first and the second image;
processing means (53) for applying a procedure for detecting areas susceptible to document fraud in the first image and in the second registered image;
processing means (55) for dividing each sensitive area detected into a plurality of sub-parts and, for each sub-part, for calculating a signature representative of the content of said sub-part;
search means for searching, for each sub-part of the first image, a spatially corresponding sub-part in the second registered image, and, for each sub-part of the first image having a corresponding sub-part in the second image, calculating means (56) for calculating a local dissimilarity measure between the corresponding sub-parts from the signatures;
determination means for determining that the first and second documents are identical (572) when a global dissimilarity measure determined (570) from a probability distribution of the local dissimilarity measures is less than a first predetermined threshold;
determination means (577) for determining, when the first document is different from the second document, a level of difference between the first and the second document as a function of a value representative of a proportion of pixels of the first image located in sensitive areas belonging to a sub-part having a corresponding sub-part in the second image whose dissimilarity measure is greater than a second predetermined threshold; and, detecting means (579) for detecting fraud when the level of difference is less than a third predetermined threshold.
13) Computer program, characterized in that it comprises instructions for implementing, by a device (300), the method according to any one of claims 1 to 11, when said program is executed by a processor of said device.
14) Storage means, characterized in that they store a computer program comprising instructions for implementing, by a device (300), the method according to any one of claims 1 to 11, when said program is executed by a processor of said device.
Patent family:
Publication number | Publication date
US10936866B2 | 2021-03-02
EP3637379A1 | 2020-04-15
US20200110932A1 | 2020-04-09
EP3637379B1 | 2021-04-21
FR3086884B1 | 2020-11-27
Cited references:
Publication number | Filing date | Publication date | Applicant | Patent title
US20120324534A1 | 2009-11-17 | 2012-12-20 | Holograms Industries | Method and system for automatically checking the authenticity of an identity document
US20150206372A1 | 2012-08-22 | 2015-07-23 | Shandong New Beiyang Information Technology Co., Ltd. | Paper money identification method and device
FR3047688A1 | 2016-02-11 | 2017-08-18 | Morpho | Method of securing and verifying a document
US20180285639A1 | 2017-03-30 | 2018-10-04 | Idemia Identity & Security France | Method for analyzing a structured document likely to be deformed
ES2424480T3 | 2002-05-14 | 2013-10-02 | Schreiner Group GmbH & Co. KG | Authentication patterns visible for printed document
US8260061B2 | 2007-09-21 | 2012-09-04 | Sharp Kabushiki Kaisha | Image data output processing apparatus and image data output processing method
JP4486987B2 | 2007-09-26 | 2010-06-23 | Sharp Kabushiki Kaisha | Image data output processing apparatus, image data output processing method, program, and recording medium
US20160210621A1 | 2014-12-03 | 2016-07-21 | Sal Khan | Verifiable credentials and methods thereof
RU2668717C1 | 2017-12-13 | 2018-10-02 | Общество с ограниченной ответственностью "Аби Продакшн" | Generation of marking of document images for training sample
Legal status:
2019-09-19 | PLFP | Fee payment | Year of fee payment: 2
2020-04-10 | PLSC | Publication of the preliminary search report | Effective date: 2020-04-10
2020-09-17 | PLFP | Fee payment | Year of fee payment: 3
2021-09-22 | PLFP | Fee payment | Year of fee payment: 4
Priority and related applications:
Application number | Filing date | Patent title
FR1859345A | 2018-10-09 | Documentary fraud detection process (priority application)
US16/585,548 | 2019-09-27 | Method for detecting document fraud
EP19201481.9A | 2019-10-04 | Method and apparatus for detecting fraudulent documents